Abstract


Anomaly  detection  has  been  used  successfully  on  hyperspectral  images  for  over  a 
decade.  However,  there  is  an  ever  increasing  need  for  real-time  anomaly  detectors. 
Historically,  anomaly  detection  methods  have  focused  on  analysis  after  the  entire  image 
has  been  collected.  As  useful  as  post-collection  anomaly  detection  is,  there  is  a  great 
advantage  to  detecting  an  anomaly  as  it  is  being  collected. 

This  research  is  focused  on  speeding  up  the  process  of  detection  for  a  pre-existing 
method,  Linear  RX,  which  is  a  variation  on  the  traditional  Reed-Xiaoli  detector.  By 
speeding  up  the  process  of  detection,  it  is  possible  to  create  a  real-time  anomaly  detector. 
The window covariance matrix is our main area of focus for speed improvement. Several
methods  were  investigated,  including  QR  factorization  and  tracking  the  change  in  the 
window  covariance  matrix  as  it  moves  through  the  image.  Finally,  performance 
comparisons are made to the original Linear RX detector.

REAL-TIME ANOMALY DETECTION OF HYPERSPECTRAL IMAGES


1.  Introduction 


1.1  General  Issue 

As  sensor  technology  advances,  the  amount  of  data  produced  is  ever  increasing. 
This  massive  amount  of  data  makes  anomaly  detection  by  human  eyes  alone  virtually 
impossible.  Anomaly  detection  by  humans  alone  is  also  impractical,  since  the  human  eye 
can  be  tricked  by  methods  such  as  camouflage.  Because  of  this,  anomaly  detection 
methods  are  being  created  to  help  analyze  images  in  an  accurate  and  timely  manner. 

Historically,  anomaly  detection  methods  have  focused  on  analysis  after  the  entire 
image  has  been  collected.  As  useful  as  post-collection  anomaly  detection  is,  there  is  a 
great  advantage  to  detecting  an  anomaly  as  the  data  is  being  collected.  The  faster  an 
anomaly is detected, the sooner something can be done about it. By speeding up the
process of detection, it becomes possible to create an anomaly detector that works in
real time.

1.2  Methodology 

One  common  anomaly  detector  is  the  RX  detector  (Reed  &  Yu,  1990).  This 
detector  decides  whether  each  pixel  is  a  statistical  anomaly  or  not  by  comparing  a  pixel’s 
score  to  a  statistical  value.  This  pixel’s  score  is  computed  using  an  inverse  covariance 
matrix.  Linear  RX  (Williams,  Bihl,  &  Bauer)  is  another  anomaly  detector  that  uses  a 
similar  technique  of  scoring  using  an  inverse  covariance  matrix.  It  is  well  known  that 
taking  the  inverse  of  a  large  matrix,  like  a  covariance  matrix,  can  be  very  computationally 
intensive and time consuming. However, it has been shown in (Chang, Ren, & Chiang,
2001) and (Du & Zhang, 2011) that by using QR factorization, the inverse of a
covariance  matrix  can  be  found  in  a  more  timely  manner.  In  this  research,  we  will  show 
that  QR  factorization  can  indeed  speed  up  Linear  RX  and  that  the  qualities  of  Linear  RX 
can  be  taken  advantage  of  in  a  real-time  processing  setting. 

1.3  Preview 

Chapter  2  explains  some  of  the  basics  of  hyperspectral  images  and  anomaly 
detection.  It  also  contains  information  on  some  current  real-time  anomaly  detectors  as 
well  as  the  concepts  used  in  the  implementation  and  analysis  of  this  research.  Chapter  3 
details  how  QR  factorization  is  implemented  in  order  to  speed  up  the  detection  process. 
Chapter  4  includes  the  analysis  and  results  of  this  new  method.  Finally,  Chapter  5 
provides  an  overview  of  the  work  completed  in  this  paper  as  well  as  recommendations  for 
future work.

2. Literature Review


This  chapter  outlines  current  practices  in  hyperspectral  imaging  (HSI)  and 
anomaly  detection  as  well  as  mathematical  methods  used  in  this  thesis.  This  chapter  is 
organized  into  six  sections:  HSI  basics,  Anomaly  Detection,  Principal  Components 
Analysis,  Normalized  Difference  Vegetation  Index,  QR  Factorization,  and  Performance 
Measurements. 

2.1  HSI  Basics 

All  images  capture  some  portion  of  the  electromagnetic  (EM)  spectrum.  Images 
taken  with  a  common  digital  camera  capture  the  visible  portion  of  the  EM  spectrum  in 
three  discrete  wavelength  bands:  red,  green  and  blue.  In  a  similar  manner,  multispectral 
images  capture  discrete  wavelength  bands,  but  feature  multiple  bands  across  several  EM 
regions.  Hyperspectral  images  are  similar  in  concept  to  multispectral  images  except  for 
some  key  differences.  One  of  the  main  differences  is  that  instead  of  capturing  discrete 
wavelength  bands,  hyperspectral  images  capture  a  finely  sampled  contiguous  region  of 
the  EM  spectrum,  which  is  then  broken  down  into  many  bands.  Hyperspectral  images  are 
comprised  of  20  or  more  wavelength  bands  (Stein,  Beaven,  Hoff,  Winter,  Schaum,  & 
Stocker,  2002).  Figure  1  illustrates  the  EM  spectrum  and  the  various  regions  of  the 
spectrum.

Figure 1: Electromagnetic Spectrum. Reprinted from (Landgrebe D. A., 2003)


All of this information must be properly organized so that two aspects of the data
are captured: spatial location and spectral wavelength band. Typically the
hyperspectral  image  data  is  organized  into  an  ‘image  cube’  (Smeteck  &  Bauer,  2008). 
This 3-dimensional data array has height m, width n, and k spectral bands. In this
way, the spatial location is recorded as a pixel, using the height and width dimensions,
and the spectral wavelength band is captured in the third dimension. This image cube can
be thought of as a stack of k images of size m x n, where each image is a representation of
the  same  physical  area,  but  on  different  spectral  bands.  Figure  2  demonstrates  this 
concept.

Figure 2: Illustration of Matrices within an Image Cube. Reprinted from (Miller, 2009)
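
As a minimal sketch of the image cube layout, assuming the cube is held as a NumPy array of shape (m, n, k), the spectrum of a pixel and the image for a single band are plain slices. The array name, sizes, and values here are hypothetical and only illustrate the indexing.

```python
import numpy as np

# Hypothetical cube: height m, width n, k spectral bands (synthetic values).
m, n, k = 4, 5, 6
cube = np.arange(m * n * k, dtype=float).reshape(m, n, k)

# Spectrum of the pixel at row 2, column 3: a length-k vector.
spectrum = cube[2, 3, :]

# Image of the scene in band 0: an m x n matrix.
band0 = cube[:, :, 0]

# Flatten to a (pixels x bands) data matrix, a convenient form for
# estimating mean vectors and covariance matrices.
pixels = cube.reshape(m * n, k)
```

Row `2 * n + 3` of `pixels` holds the same spectrum as `cube[2, 3, :]`, so the spatial position can always be recovered from the row index.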

Hyperspectral  imaging  is  a  powerful  tool  because  it  takes  advantage  of  the  fact 
that  different  materials  reflect,  absorb  and  emit  electromagnetic  energy  differently  due  to 
each  material’s  molecular  composition  (Manolakis  &  Shaw,  2002),  (Eismann,  2011). 
Theoretically  each  material  has  a  unique  reflection  and  radiation  pattern,  which  is 
sometimes  called  a  material’s  spectral  signature.  Because  of  this,  different  materials  can 
be  identified  by  their  spectral  signature.  For  example,  trees  will  reflect  and  radiate 
electromagnetic  energy  differently  than  a  parking  lot  and  therefore  will  have  a  different 
spectral  signature.  By  capturing  these  spectral  signatures,  HSI  is  useful  in  a  variety  of 
applications  ranging  from  environmental  monitoring  to  surveillance  (Stein,  Beaven,  Hoff, 
Winter, Schaum, & Stocker, 2002).

This paper will assume that the HSI data is collected from a “pushbroom” sensor.
This type of sensor collects data one line at a time as the sensor platform (e.g., satellite,
plane, or UAV) flies over the area of interest.


Figure  3:  “Pushbroom”  Data  Collection.  Reprinted  from  (Bihl) 


2.2  Anomaly  Detection 

Anomaly  detection  and  recognition  are  relatively  common  applications  of  HSI. 
There  are  two  general  types  of  detectors:  anomaly  detection  algorithms  and  signature 
matching  algorithms  (Stein,  Beaven,  Hoff,  Winter,  Schaum,  &  Stocker,  2002),  (Eismann, 
2011).  Signature  matching  algorithms  require  a  priori  information  on  what  type  of 
material  the  detector  is  searching  for.  This  can  prove  difficult,  since  the  type  of  target  is 
not  always  known.  Even  if  the  type  of  target  is  known,  atmospheric  conditions  distort 
spectral  readings,  which  then  affect  the  success  of  detection.  Anomaly  detection,  on  the 
other  hand,  does  not  require  a  priori  information  about  the  material  the  detector  is 
searching for. Instead, anomaly detectors are designed to find anomalies, that is, pixels that are
statistically different from the background of the image (Eismann, 2011), (Stein, Beaven,
Hoff, Winter, Schaum, & Stocker, 2002). It is important to note that these detectors only
find anomalous objects; they do not classify their spectral signatures. For example, if there is
a  building  in  the  middle  of  a  field,  an  anomaly  detector  might  recognize  the  building  as 
an  anomaly,  but  would  not  know  that  it  is  a  building.  However,  if  the  image  contains 
many  buildings,  parking  lots,  and  streets,  then  the  detector  may  recognize  the  buildings  as 
background  and  grass  growing  in  a  sidewalk  crack  as  an  anomaly. 

Hyperspectral  images  are  better  suited  for  anomaly  detection  than  a  common 
RGB  image.  This  is  due  to  the  multiple  spectral  bands  within  which  an  anomaly  could  be 
detected.  Suppose  there  is  a  field  and  in  the  middle  of  this  field  there  is  a  crate  disguised 
with  camouflage.  If  we  run  an  anomaly  detection  algorithm  on  an  RGB  image  of  this 
field,  an  anomaly  may  not  be  detected.  The  camouflage  could  almost  completely  disguise 
the  crate  in  the  RGB  image,  but  it  would  not  be  so  easily  hidden  in  a  hyperspectral  image 
of  the  field.  The  anomaly  may  be  detected  on  a  different  spectral  band,  say  in  the  infrared 
region,  which  is  not  captured  in  the  RGB  image,  where  water  absorption  properties 
would  be  noticeably  different  between  the  camouflage  and  vegetation  in  the  field.  This  is 
why  HSI  is  more  effective  at  detecting  anomalies  than  common  RGB  images. 

There  are  three  types  of  anomaly  detection  techniques:  supervised,  semi- 
supervised,  and  unsupervised.  Training  data  of  both  the  background  and  the  anomaly  is 
needed  for  supervised  detection.  Semi-supervised  detection  only  requires  training  data  of 
the  background.  As  for  unsupervised  detection,  no  training  data  is  required,  just  as  the 
name suggests. Depending on the situation, supervised and semi-supervised detection can
have significant issues, since backgrounds and anomalies can vary drastically from image
to image. Because of this, it is difficult to train a detector on one image and use it on
another  with  good  results. 

There  are  two  main  approaches  to  defining  the  background  of  the  image:  local 
and  global.  Global  anomaly  detection  defines  background  as  the  entire  image  excluding 
the  test  pixel.  Therefore,  global  methods  compare  a  given  test  pixel  to  the  rest  of  the 
image.  Local  anomaly  detection  defines  background  as  a  smaller  subset  of  the  image. 
Usually,  it  compares  the  test  pixel  to  a  window  of  pixels  around  the  test  pixel.  Both 
methods  have  their  advantages.  Global  anomaly  detection  is  less  susceptible  to  false 
alarms  than  local  anomaly  detection.  On  the  other  hand,  local  anomaly  detection  is  better 
at  finding  an  isolated  target  that  resembles  background  (Stein,  Beaven,  Hoff,  Winter, 
Schaum,  &  Stocker,  2002),  (Smeteck  &  Bauer,  2008). 

One  common  method  used  to  find  pixels  that  are  statistically  different  from  the 
background  is  to  measure  the  Mahalanobis  distance  between  the  pixel  and  the  mean 
vector  of  the  background  (Eismann,  2011).  This  distance  is  defined  as: 

$$ D^2 = (x - \mu)^T S^{-1} (x - \mu) \tag{1} $$

where x is the test pixel, $\mu$ is the mean vector of the background, and S is the covariance
matrix  of  the  background  (Dillon  &  Goldstein,  1984).  When  compared  to  Euclidean 
distance,  Mahalanobis  distance  has  the  distinct  advantage  of  accounting  for  any 
correlations  between  the  variables  (Dillon  &  Goldstein,  1984).  This  is  why  this  measure 
has  been  used  in  developing  several  anomaly  detectors,  including  the  Reed-Xiaoli 
detector. 
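
The distance in Equation 1 can be sketched as follows. This is an illustrative sketch, not the implementation used in this research: the function name and the synthetic background are assumptions, and the covariance is the usual sample estimate with divisor N - 1.

```python
import numpy as np

def mahalanobis_sq(x, background):
    """Squared Mahalanobis distance of pixel x from a (pixels x bands) background."""
    mu = background.mean(axis=0)
    S = np.cov(background, rowvar=False)      # sample covariance, divisor N - 1
    d = x - mu
    # Solve S y = d rather than forming the inverse explicitly.
    return float(d @ np.linalg.solve(S, d))

rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))         # synthetic background, 3 bands
typical = background.mean(axis=0)              # sits exactly at the mean
outlier = typical + 10.0                       # far from the background
```

A pixel at the background mean scores zero, while the shifted pixel receives a large score, which is the behavior a threshold on the distance relies on.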
2.2.1 Reed-Xiaoli (RX) Detector

The  Reed-Xiaoli  (RX)  detector,  introduced  by  Reed  and  Yu  (Reed  &  Yu,  1990), 
is  an  unsupervised  local  anomaly  detector  based  on  the  Mahalanobis  distance.  Using  a 
moving  window,  the  RX  detector  identifies  anomalies  by  comparing  the  center  pixel  to 
the  rest  of  the  pixels  in  the  window,  as  illustrated  in  Figure  4.  This  center  pixel  is  given  a 
score  based  on  a  generalized  likelihood  ratio  test.  After  a  pixel  has  been  tested,  the 
window  is  moved  across  each  row,  one  pixel  at  a  time. 


Figure  4:  RX  Moving  Window.  Reprinted  from  (Williams,  Bihl,  &  Bauer) 


The RX detector assumes Gaussian data and uses Equation 2 as the pixel score:

$$ RX(x) = (x - \mu)^T \left[ \frac{N}{N+1} S + \frac{1}{N+1} (x - \mu)(x - \mu)^T \right]^{-1} (x - \mu) \tag{2} $$

where x is the test pixel, $\mu$ is the mean vector of the window, S is the covariance matrix
of the window, and N is the number of pixels in the window. For a given tolerance $\alpha$, if an
RX(x) score is greater than the corresponding chi-square critical value $\chi^2_{\alpha}$, then x is
considered an anomaly. Note that as the window size, N, gets larger, the RX score
approaches the Mahalanobis distance.
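
To illustrate how the score in Equation 2 relates to the Mahalanobis distance of Equation 1, the sketch below computes both on a synthetic window. The function names and data are hypothetical. Applying the Sherman-Morrison identity to Equation 2 gives the closed form RX = D^2 (N + 1) / (N + D^2), which tends to D^2 as N grows; the code checks this numerically.

```python
import numpy as np

def mahalanobis_sq(x, window):
    """Equation 1 against an (N x bands) window."""
    mu = window.mean(axis=0)
    S = np.cov(window, rowvar=False)
    d = x - mu
    return float(d @ np.linalg.solve(S, d))

def rx_score(x, window):
    """Equation 2 against an (N x bands) window."""
    N = window.shape[0]
    mu = window.mean(axis=0)
    S = np.cov(window, rowvar=False)
    d = x - mu
    # Bracketed matrix of Equation 2: weighted covariance plus a rank-one term.
    M = (N / (N + 1)) * S + (1.0 / (N + 1)) * np.outer(d, d)
    return float(d @ np.linalg.solve(M, d))

rng = np.random.default_rng(0)
window = rng.normal(size=(5000, 3))            # synthetic window, large N
x = window.mean(axis=0) + np.array([1.0, 0.0, 0.0])
rx = rx_score(x, window)
d2 = mahalanobis_sq(x, window)
```

With N = 5000 the two scores differ only in the later decimal places, consistent with the limiting behavior noted above.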

Choosing an appropriately sized window for the RX detector can prove difficult.
If the window is too small, the detector will not pick up on large anomalies because
the majority of the window is filled with the pixels of that anomaly. If the window is too
large, the window could contain several anomalies and the representation of the original
background is corrupted. Because of this, it is important to set the window size based on
the size of the target to be detected. Variations of RX exist to counteract pixel spatial
proximity  issues.  These  methods  include  applying  a  guard  window  around  a  test  pixel,  so 
that  pixels  immediately  surrounding  the  test  pixel  are  not  included  in  the  background 
calculation  (Eismann,  2011),  and  different  geometric  window  shapes  in  order  to  increase 
the  spatial  distance  of  the  pixels  used  to  calculate  the  background  (Williams,  Bihl,  & 
Bauer). 

2.2.2  Linear  RX  Detector 

Linear  RX  (LRX),  developed  by  Williams,  Bihl,  and  Bauer,  is  a  variation  on  the 
classic  RX  detector  (Williams,  Bihl,  &  Bauer).  Instead  of  using  a  moving  window  around 
the test pixel, LRX looks at a line of pixels above and below the test pixel. If there are
insufficient pixels above or below the test pixel, the remaining pixels are taken from the
bottom of the previous column or the top of the next column, respectively, as shown in
Figure 5.

Figure 5: LRX Moving Column. Reprinted from (Williams, Bihl, & Bauer)

LRX  was  created  to  address  the  issue  of  pixel  correlation  based  on  spatial 
proximity  (Williams,  Bihl,  &  Bauer).  For  example,  a  large  anomaly  that  is  about  the  size 
of  the  RX  window  will  not  be  picked  up  by  the  standard  RX  detector.  However,  the  LRX 
will  recognize  it  as  an  anomaly  since  it  is  comparing  the  test  pixel  to  pixels  across  a  wide 
range  of  the  image.  The  use  of  a  line  of  pixels  in  place  of  a  window  increases  the  average 
distance  between  the  pixels,  which  allows  for  reduction  of  bias  and  error  in  the  estimation 
of  the  mean  vector  and  covariance  matrix  (Williams,  Bihl,  &  Bauer). 
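
The LRX line of pixels can be sketched by numbering the pixels column by column, so that stepping past the bottom of one column lands on the top of the next. This is a minimal illustration under stated assumptions: the function names and the `half` parameter are hypothetical, and truncating the line at the very first and last pixel of the image is a choice made here, not something specified by the LRX description.

```python
def lrx_window_indices(row, col, m, n, half):
    """Column-major flat indices of up to `half` pixels above and below (row, col).

    m is the image height and n the image width. Running off the top of a
    column continues at the bottom of the previous column; running off the
    bottom continues at the top of the next column. Truncation at the ends
    of the image is an assumption of this sketch.
    """
    center = col * m + row                     # column-major position of test pixel
    lo = max(center - half, 0)
    hi = min(center + half, m * n - 1)
    return [i for i in range(lo, hi + 1) if i != center]

def to_row_col(flat_index, m):
    """Convert a column-major flat index back to (row, col)."""
    return flat_index % m, flat_index // m

# Test pixel at row 1 of column 1 in a 5 x 3 image, 3 pixels on each side.
indices = lrx_window_indices(1, 1, m=5, n=3, half=3)
positions = [to_row_col(i, 5) for i in indices]
```

Only one pixel lies above the test pixel in its own column, so the other two "above" pixels come from the bottom of column 0, at rows 3 and 4.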

2.2.3  Real-time  Detectors 

There is an ever increasing need for real-time anomaly detection methods (Du &
Zhang, 2011). We use the term “real-time” loosely, since there is no agreed-upon definition of
what makes a detection method real-time. For our purposes, if the detection method can
be carried out while new rows of data are added and it is at least as fast as current
post-collection methods, we will consider it real-time.

There are two current real-time detectors that will be used for comparison in this
paper.  Both  are  presented  in  Stellman  et  al.  (Stellman,  Hazel,  Bucholtz,  Michalowicz, 
Stocker,  &  Schaaf,  2000). 

The first detector is a real-time version of the RX algorithm. Each time there is
new data, a new mean vector $\mu_n$ and a new covariance matrix $S_n$ are calculated using
Equation 3 and Equation 4, respectively. Note that the contribution of the new pixel is
weighted by $\alpha$ as defined by Equation 5.

$$ \mu_n = (1 - \alpha)\mu_{n-1} + \alpha z_n \tag{3} $$

$$ S_n = (1 - \alpha) S_{n-1} + \alpha (1 - \alpha)(z_n - \mu_{n-1})(z_n - \mu_{n-1})^T \tag{4} $$

$$ \alpha = \frac{1}{N_{eff} + 1} \tag{5} $$

where $z_n$ is the vector of spectral components for the nth pixel and $N_{eff}$ is the effective
number of samples in the average (Stellman, Hazel, Bucholtz, Michalowicz, Stocker, & Schaaf, 2000).
The inverse covariance matrix is calculated by:

$$ S^{-1} = E \Lambda^{-1} E^T \tag{6} $$

where E is a matrix of the eigenvectors of the covariance matrix with the form
$E = [e_1, \ldots, e_j]$, with j as the number of spectral bands (Stellman, Hazel, Bucholtz,
Michalowicz, Stocker, & Schaaf, 2000). Also, $\Lambda$ is the diagonal matrix with the j
covariance matrix eigenvalues on the diagonal. The principal components $x_n$ are then defined as

$$ x_n = E^T (z_n - \mu_n). \tag{7} $$

The RX score $r_n$ is calculated using

$$ r_n = (z_n - \mu_n)^T E \Lambda^{-1} E^T (z_n - \mu_n) = x_n^T \Lambda^{-1} x_n = \sum_{i=1}^{j} \frac{x_n(i)^2}{\lambda_i} \tag{8} $$

where $x_n(i)$ represents the ith element of the principal component $x_n$ and $\lambda_i$ is the
associated eigenvalue.
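
One update and scoring step of this detector (Equations 3 through 8) can be sketched as follows. The function names are hypothetical and the sketch is not taken from Stellman et al.; it only mirrors the equations as written here.

```python
import numpy as np

def update_stats(mu_prev, S_prev, z, n_eff):
    """One step of Equations 3-5: exponentially weighted mean and covariance."""
    alpha = 1.0 / (n_eff + 1.0)
    d = z - mu_prev
    mu = (1.0 - alpha) * mu_prev + alpha * z
    S = (1.0 - alpha) * S_prev + alpha * (1.0 - alpha) * np.outer(d, d)
    return mu, S

def rx_score_eig(z, mu, S):
    """Equations 6-8: score through the eigendecomposition of S."""
    eigvals, E = np.linalg.eigh(S)            # S = E diag(eigvals) E^T
    x = E.T @ (z - mu)                        # principal components (Equation 7)
    return float(np.sum(x**2 / eigvals))      # Equation 8

# Small worked case: two bands, previous mean 0, previous covariance I.
z = np.array([2.0, 0.0])
mu, S = update_stats(np.zeros(2), np.eye(2), z, n_eff=1)
score = rx_score_eig(z, mu, S)
```

With N_eff = 1 the weight is 0.5, so the mean moves halfway to the new pixel and the variance along the first band grows to 1.5. The score of the new pixel is then 1 / 1.5.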

For the second real-time anomaly detection algorithm, we will use a clustering
algorithm defined in (Stellman, Hazel, Bucholtz, Michalowicz, Stocker, & Schaaf, 2000).
This clustering algorithm is a semi-supervised detection method; therefore, it requires a
training set of background data. The algorithm looks at N frames of sensor data and each
new frame of data replaces the oldest frame. The training set is used as the initial N
frames, which are then divided into C clusters. Then each pixel has the mean vector of its
cluster subtracted from it and a covariance matrix is computed for the centered pixels. As
new frames of data are included, the algorithm separates the new pixels into the
pre-constructed clusters by finding the cluster whose mean is closest to that pixel. This can
also be stated as:

$$ c_n = \arg\min_{i \in (1, C)} |z_n - m_i| \tag{9} $$

where $c_n$ is the cluster to which the nth pixel, $z_n$, is assigned and $m_i$ is the mean of the ith
cluster.
Once all of the new pixels have been assigned to a cluster, the mean of each cluster is
recalculated. Each pixel in the new frame is then centered and the covariance matrix is
updated, using Equation 10:

$$ K_j = K_{j-1}\left(1 - \frac{1}{N}\right) + \frac{1}{N(P+1)} \sum_{i=1}^{P} x_i x_i^T \tag{10} $$

where $K_j$ is the jth update of the covariance matrix, N is the number of frames included,
P is the number of pixels in a frame, and $x_i$ is the centered ith pixel.
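
The cluster assignment of Equation 9 and the covariance update of Equation 10 can be sketched as below. The function names are hypothetical, and the distance in Equation 9 is assumed here to be the Euclidean norm.

```python
import numpy as np

def assign_clusters(pixels, means):
    """Equation 9: index of the nearest cluster mean for each pixel."""
    # dists[p, c] = Euclidean distance from pixel p to cluster mean c (assumed norm)
    dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def update_covariance(K_prev, centered, n_frames):
    """Equation 10: blend the previous covariance with one new frame."""
    P = centered.shape[0]                      # pixels in the frame
    scatter = centered.T @ centered            # sum of x_i x_i^T over the frame
    return K_prev * (1.0 - 1.0 / n_frames) + scatter / (n_frames * (P + 1))

means = np.array([[0.0, 0.0], [10.0, 10.0]])
pixels = np.array([[1.0, 1.0], [9.0, 8.0]])
labels = assign_clusters(pixels, means)

centered = np.array([[1.0, 0.0], [-1.0, 0.0]])
K = update_covariance(np.eye(2), centered, n_frames=2)
```

Each new frame contributes with weight on the order of 1/N, so the covariance estimate gradually forgets frames older than roughly N frames.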

2.3  Principal  Components  Analysis 

There  are  many  constraints  on  the  storage  and  analysis  of  HSI  data.  For  example, 
the storage capability of onboard systems may be limited for the unmanned aerial vehicle
(UAV), helicopter, or airplane that is collecting the data. Also, much of the analysis is
carried  out  on  commercial  off-the-shelf  (COTS)  computers  (Farrell  &  Mersereau,  2005). 
Because  of  these  limitations,  it  is  common  practice  to  conduct  a  data  reduction  technique, 
such  as  principal  components  analysis  (PCA)  (Gu,  Liu,  &  Zhang,  2006),  (Farrell  & 
Mersereau,  2005).  PCA  is  a  type  of  multivariate  statistical  technique  that  is  used  to 
reduce the dimensionality of a set of data. Principal components are linear combinations of
the data’s variables (Dillon & Goldstein, 1984). The first component is created such that
the greatest amount of variance in the data is captured by the linear combination while
its coefficient vector has length one. The second component is constructed orthogonal to the first
component and it explains as much of the remaining variance as possible. This process
continues until v components are created, where v is the number of variables in the data
set. In order to reduce dimensionality, only some of the principal components are
retained. However, discarding components can lose important information contained in an image.
Therefore,  the  goal  is  to  explain  the  most  variance  of  the  original  set  with  as  few 
principal  components  as  possible.  Retained  principal  components  are  then  used  instead  of 
the  variables  in  the  analysis. 

Applying  this  to  HSI,  principal  components  are  found  in  the  data  cube  with  the 
spectral  bands  acting  as  the  variables.  By  using  principal  components,  we  can  achieve 
dimensionality  reductions  of  approximately  90  percent,  while  retaining  much  of  the 
variance  within  the  data.  However,  it  is  important  to  retain  a  sufficient  number  of 
principal  components  in  order  for  anomalies,  which  can  be  a  small  percentage  of  the  data, 
to  be  detected. 
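
A minimal sketch of PCA on HSI data, with the spectral bands as the variables, is given below. It assumes the pixels have been arranged as a (pixels x bands) matrix; the function names and the synthetic data are illustrative only.

```python
import numpy as np

def pca_fit(data, n_keep):
    """Return the mean, the top n_keep loading vectors, and the variance explained."""
    mu = data.mean(axis=0)
    S = np.cov(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(S)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # sort by variance, descending
    loadings = eigvecs[:, order[:n_keep]]      # unit-length, mutually orthogonal
    explained = float(eigvals[order[:n_keep]].sum() / eigvals.sum())
    return mu, loadings, explained

def pca_transform(data, mu, loadings):
    """Project data onto the retained principal components."""
    return (data - mu) @ loadings

# Synthetic example: 10 bands driven by 2 underlying factors plus small noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(500, 2))
mix = rng.normal(size=(2, 10))
data = latent @ mix + 0.01 * rng.normal(size=(500, 10))

mu, loadings, explained = pca_fit(data, n_keep=2)
scores = pca_transform(data, mu, loadings)
```

Here two components replace ten bands while keeping nearly all of the variance. Real imagery is rarely this clean, which is why the number of retained components has to be chosen with the anomalies in mind.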


2.4  QR  Factorization 

QR  factorization  is  one  technique  that  can  be  used  to  help  simplify  matrix 
computations  in  the  RX  equation.  How  QR  factorization  accomplishes  this  task  is  shown 
later  in  the  paper. 

A real matrix A can be factored into two matrices: an orthogonal matrix Q and an
upper (right) triangular matrix R. The factorization is given by Equation 11.

$$ A = QR \tag{11} $$

There are several ways to compute Q and R, including Givens rotations, Householder
transformations, and the Gram-Schmidt orthogonalization process (Golub & Van Loan,
1989), (Meyer, 2000).
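
As a small numerical illustration of Equation 11, the sketch below factors an arbitrary 3 x 2 matrix with NumPy's `numpy.linalg.qr`. The matrix values are made up for the example.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Reduced QR: Q is 3 x 2 with orthonormal columns, R is 2 x 2 upper triangular.
Q, R = np.linalg.qr(A)

reconstruction_ok = np.allclose(Q @ R, A)
orthonormal_ok = np.allclose(Q.T @ Q, np.eye(2))
triangular_ok = np.allclose(R, np.triu(R))
```

For a tall matrix the reduced form is usually sufficient, since the columns of Q are still orthonormal and R carries all of the information needed to reconstruct A.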
2.5 Performance Measurements


In  order  to  measure  the  performance  of  the  anomaly  detection  methods,  we  need 
to  look  at  what  mistakes  the  methods  are  making.  Each  pixel  has  two  possible  true  states, 
either  the  pixel  is  an  anomaly  or  it  is  not.  The  detector  also  can  classify  each  pixel  as  an 
anomaly  or  background.  This  leads  to  four  possible  states  for  the  classification  of  each 
pixel,  which  are  laid  out  in  Figure  6,  commonly  termed  a  confusion  matrix. 


                                Detector Classification
                                Anomaly            Background
True State     Anomaly          True Positive      False Negative
               Background       False Positive     True Negative

Figure 6: Confusion Matrix. Adapted from (Fawcett, 2006)

As  shown  in  the  confusion  matrix  above,  there  are  two  ways  the  detector  can  be 
correct.  If  a  pixel  is  true  positive  or  true  negative,  then  the  detector  successfully  classified 
the  true  state  of  the  pixel.  However,  if  a  pixel  is  false  positive  or  false  negative,  then  the 
detector  classified  a  background  pixel  as  an  anomaly  or  an  anomalous  pixel  as 
background. Obviously, we are interested in a detector that has many true positives and
true negatives, with very few false positives or false negatives.

We  will  be  particularly  concerned  with  the  true  positive  fraction  (TPF)  and  false 
positive  fraction  (FPF).  The  true  positive  fraction  is  defined  as  the  number  of  true 
positives  divided  by  the  number  of  anomalies.  The  TPF  essentially  is  the  percentage  of 
anomalies the detector has successfully identified. The false positive fraction is the
number of false positives divided by the number of non-anomalies. This means the FPF is
the  percentage  of  non-anomalies  that  are  incorrectly  identified. 

Receiver  operating  characteristic  (ROC)  curves  are  commonly  used  to  assess  the 
accuracy  of  detectors.  A  ROC  curve  is  a  plot  of  the  TPF  vs.  FPF  given  some  incremental 
change  in  the  threshold  (Fawcett,  2006).  A  ROC  curve  shows  how  changing  the  detection 
threshold affects how anomalies are detected. A stricter threshold gives fewer false
alarms but detects very few targets, whereas a looser threshold detects most
of the targets but produces many false positives.
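
The points of a ROC curve can be sketched by sweeping a threshold over the pixel scores and computing the TPF and FPF at each step. The function name, scores, and truth labels below are hypothetical, and the sketch assumes a pixel is called anomalous when its score exceeds the threshold.

```python
import numpy as np

def roc_points(scores, is_anomaly, thresholds):
    """Return (FPF, TPF) pairs, flagging a pixel when score > threshold."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(is_anomaly, dtype=bool)
    points = []
    for t in thresholds:
        flagged = scores > t
        tp = np.sum(flagged & truth)
        fp = np.sum(flagged & ~truth)
        tpf = tp / truth.sum()                 # true positives / anomalies
        fpf = fp / (~truth).sum()              # false positives / non-anomalies
        points.append((float(fpf), float(tpf)))
    return points

scores = [0.5, 1.0, 2.0, 8.0, 9.0, 3.0]
truth = [0, 0, 0, 1, 1, 0]
points = roc_points(scores, truth, thresholds=[0.0, 2.5, 5.0, 10.0])
```

As the score cutoff rises from 0 to 10, the points move from (1, 1), where everything is flagged, to (0, 0), where nothing is flagged. The cutoff of 5 separates the two classes perfectly in this toy example.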

In  order  to  compare  several  methods,  each  is  used  with  the  same  changes  in 
threshold.  The  “northwest”  rule  is  commonly  used  as  a  discriminator  (Fawcett,  2006). 
That is, if method A’s ROC curve is entirely to the north and/or west of method B’s ROC
curve,  it  is  said  that  method  A  dominates  B.  This  means  that  no  matter  the  threshold, 
method  A  performs  better  than  B.  This  is  why  ROC  curves  are  so  useful  to  compare 
several detection methods.

3. Methodology


3.1  Chapter  Overview 

This  chapter  outlines  the  assumptions  and  process  we  propose  in  order  to  speed 
up  the  Linear  RX  detector  and  implement  it  as  a  real-time  anomaly  detector. 

3.2  Algorithm  Development 

The  basic  idea  behind  our  algorithm  comes  from  the  very  nature  of  HSI  data 
collection  and  the  Linear  RX  detector.  As  explained  in  Chapter  2,  HSI  sensors  are 
assumed  within  this  research  to  collect  hyperspectral  image  data  using  a  “pushbroom” 
sweep  of  the  area.  Each  sweep  of  the  HSI  sensor  corresponds  to  a  row  of  pixels.  Since  the 
LRX  detector  uses  a  line  of  pixels  for  comparison  instead  of  a  local  window,  the 
sampling  of  the  image  mimics  how  the  image  is  collected.  For  the  purpose  of  this  paper, 
we  will  assume  that  the  sensor  is  collecting  data  from  west  to  east  with  respect  to  the 
picture  (see  Figure  7).  It  is  important  to  note  that  the  original  data  was  collected 
perpendicular  to  this  direction.  However,  this  research  is  using  subsets  of  the  original 
HYDICE  imagery,  so  the  assumed  direction  of  collection  is  not  important.  It  is  also 
important to note that the LRX detector is sensitive to the shape of the image. Since the
window  is  a  line,  the  height  of  the  image  plays  a  key  role  in  the  average  distance  between 
the  window  pixels.  This  average  distance  is  the  key  to  reducing  correlation  due  to  spatial 
proximity.

Image of ARES1D


Figure  7:  Direction  of  Data  Collection  Assumption 

3.2.1  Changes  to  the  Linear  RX  Anomaly  Detector 

In order to speed up the LRX detector we decided to speed up the calculation of
the inverse covariance matrix. This calculation accounts for a large share of the
computation time, not only in the LRX detector but in many Mahalanobis-based anomaly
detectors. When working with large matrices, calculating the covariance matrix and taking
the inverse can take a long time. For the LRX algorithm, over 40% of the run time is spent
calculating the covariance matrix and over 25% of the run time is spent on the score
calculation, which contains the covariance matrix inverse. It has been shown by Chang et al.
(Chang, Ren, & Chiang, 2001) and by Du and Zhang (Du & Zhang, 2011) that the inverse
covariance matrix can be calculated faster using QR factorization.

The window covariance matrix is described by Equation 12,

$$ S = \left( \frac{1}{N-1} \right) X^T X \tag{12} $$

where S is the covariance matrix, N is the size of the window, X is an N x p matrix of the
mean-corrected data from the window, and p is the number of spectral bands. Recalling
Equation 11, which expressed that any real matrix can be factored through QR, Equation 13
extends this to X:

$$ X = QR. \tag{13} $$

We can now express the covariance matrix in terms of Q and R as shown in Equation 14.

$$ S = \left( \frac{1}{N-1} \right) (QR)^T (QR) = \left( \frac{1}{N-1} \right) R^T Q^T Q R \tag{14} $$

Since Q is orthogonal, $Q^T = Q^{-1}$. This knowledge simplifies the equation even
further as displayed in Equation 15.

$$ S = \left( \frac{1}{N-1} \right) R^T R \tag{15} $$

Taking the inverse of both sides, we arrive at Equation 16.

$$ S^{-1} = (N - 1)(R^{-1})(R^T)^{-1} \tag{16} $$

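Even though there are still matrices to be inverted, they are triangular
matrices and therefore easier to invert than a covariance matrix. Since we are only
concerned with finding the matrix R, Cholesky decomposition is used in this research (Meyer,
2000). This is possible because $X^T X = R^T R$ has exactly the form of a Cholesky
factorization, so R can be obtained without forming Q.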

One large difference made to the LRX detector was to use the Mahalanobis
distance, Equation 1, instead of the usual RX score, Equation 2, which is based on the
generalized likelihood ratio test. Since our focus is on using a different method to calculate
the inverse covariance matrix, we need the score to be in terms of the inverse covariance
matrix. Switching these two equations should not change the detection much because, as
explained in Chapter 2, the RX score approaches the Mahalanobis distance as the window
size gets larger. This assertion was tested by using Mahalanobis distance in LRX with the
optimal settings set forth by Williams et al. in (Williams, Bihl, & Bauer). The ROC
curves of LRX and LRX with Mahalanobis distance were exactly the same. Therefore,
Mahalanobis distance is a good approximation of the RX score. After replacing the inverse
covariance matrix in Equation 1 with our result in Equation 16, we have Equation 17.

$$ D^2 = (x - \mu)^T (N - 1)(R^{-1})(R^T)^{-1} (x - \mu) \tag{17} $$

This  is  the  pixel  score  that  will  be  used  in  our  real-time  LRX  detector. 
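
As an illustration of Equation 17, the sketch below computes the pixel score without ever
forming the inverse covariance matrix. It is a hedged example, not the code used in this
research: the function name `mahalanobis_score_qr` and the test data are hypothetical, and
it assumes the window is mean-centered before R is computed.

```python
import numpy as np

def mahalanobis_score_qr(x, window):
    """Squared Mahalanobis distance of pixel x from the window, following Equation 17."""
    N = window.shape[0]
    mu = window.mean(axis=0)
    X = window - mu                         # mean-centered window data (assumption)
    R = np.linalg.cholesky(X.T @ X).T       # upper triangular, X^T X = R^T R
    z = np.linalg.solve(R.T, x - mu)        # z = (R^T)^{-1} (x - mu)
    # (x - mu)^T R^{-1} (R^T)^{-1} (x - mu) equals z^T z.
    return (N - 1) * float(z @ z)

rng = np.random.default_rng(1)
window = rng.normal(size=(150, 8))          # hypothetical window: 150 pixels, 8 PCs
x = rng.normal(size=8) + 3.0                # hypothetical test pixel

score_qr = mahalanobis_score_qr(x, window)

# Reference value using the explicitly inverted covariance matrix.
d = x - window.mean(axis=0)
score_direct = float(d @ np.linalg.inv(np.cov(window, rowvar=False)) @ d)
```

`np.linalg.solve` is a general solver; a solver that exploits the triangular structure of
R would be the natural choice in a speed-sensitive implementation.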

3.2.2  Assumption  for  Principal  Components  Analysis 

The Linear RX algorithm, like many detection algorithms, benefits greatly from the use of
principal components analysis (PCA). This is easily done when working on an image that
has already been collected, but is tricky when trying to work on an image that is in the
process of being collected. Therefore, we decided to run PCA on a set number of beginning
columns, say k columns, and apply the loadings across all the incoming data. One downside
to this approach is that if the background changes dramatically from the first k columns
to later in the image, then the original PCA loadings would not represent the rest of the
picture well.

For example, consider Figure 8, an image that is being collected where the left part is a
sandy beach and the right part is ocean. If PCA is carried out on the first k columns and
all of those columns are of the sand, then the loadings are useless in describing the
part of the image that is ocean. For the scope of this paper, we will assume that the
first k columns of the image will be a good indication of the background for the rest of
the image. In section 5.3.1, we will propose a possible solution to this problem that
would require further study.

Figure 8: Illustration of the Issue with PCA in Real-Time Detection
(figure annotation: first k columns that are used for PCA)
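
The approach of fitting PCA on the first k columns and reusing the loadings can be
sketched as follows. This is an illustrative example only: the function names, the cube
layout (height, width, bands), and the use of the covariance matrix (rather than the
correlation matrix) for PCA are assumptions, not details taken from this research.

```python
import numpy as np

def fit_pca_loadings(cube, k, n_pcs):
    """Fit PCA on the first k columns of an (H, W, B) cube; return mean and loadings."""
    B = cube.shape[2]
    first = cube[:, :k, :].reshape(-1, B)        # pixels from the first k columns
    mean = first.mean(axis=0)
    cov = np.cov(first, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_pcs]    # indices of the largest n_pcs
    return mean, eigvecs[:, order]

def apply_loadings(pixels, mean, loadings):
    """Project incoming pixels onto the loadings fitted from the first k columns."""
    return (pixels - mean) @ loadings

rng = np.random.default_rng(2)
H, W, B = 20, 30, 12                             # hypothetical image dimensions
cube = rng.normal(size=(H, W, B))
k, n_pcs = 5, 3

mean, loadings = fit_pca_loadings(cube, k, n_pcs)

# The same loadings are applied to every pixel, including columns collected later.
scores = apply_loadings(cube.reshape(-1, B), mean, loadings)
first_k_scores = apply_loadings(cube[:, :k, :].reshape(-1, B), mean, loadings)
```

If later columns come from a very different background, as in the beach and ocean
example, `scores` for those columns would be expressed in loadings that do not describe
them well.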

3.2.3  Decreasing  the  Number  of  Covariance  Matrix  Changes 
In many cases, images have a similar background throughout the entire image. This
background could be a grassy plain, desert area, or even a forest, but the point is we
may not need to change our window covariance matrix every time we change the window. If
the (n+1)th window has similar data to the nth window, then we could potentially use the
covariance matrix from the nth window. This could speed up the calculation times, since
we could cut out the covariance matrix update. Because of this, we decided to track the
covariance matrix changes by tracking the trace of the covariance matrix. If the change
from one step to the next was significantly large, we would change the covariance matrix
for the RX score calculation. If not, we would continue using the covariance matrix we
used in the previous step.
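
The update rule described in this section can be sketched as follows. This is a minimal
illustration under stated assumptions, not the code used in this research: the function
name, the threshold value, and the choice to obtain the trace as the sum of per-band
variances are all hypothetical.

```python
import numpy as np

def track_covariance(windows, threshold):
    """Yield (covariance_in_use, updated) for each window in sequence."""
    cov_in_use = None
    prev_trace = None
    for w in windows:
        # The trace of the covariance matrix equals the sum of the band variances.
        trace = w.var(axis=0, ddof=1).sum()
        updated = prev_trace is None or abs(trace - prev_trace) > threshold
        if updated:
            cov_in_use = np.cov(w, rowvar=False)  # recompute only on a large change
        prev_trace = trace
        yield cov_in_use, updated

rng = np.random.default_rng(3)
w0 = rng.normal(size=(50, 4))
# Hypothetical sequence: an unchanged window, then one with a much larger spread.
windows = [w0, w0.copy(), 10.0 * w0]

results = list(track_covariance(windows, threshold=1.0))
flags = [updated for _, updated in results]
covs = [cov for cov, _ in results]
```

In this toy sequence the second window reuses the first covariance matrix, and the third
triggers an update because its trace changes by far more than the threshold.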

3.3  Experimental  Design 

We  expect  that  3  factors  will  play  a  role  in  the  detector’s  accuracy:  the  number  of 
pixels  to  include  in  our  window  (or  line  size),  the  number  of  initial  columns  to  run  PCA 
on,  and  the  number  of  principal  components  to  retain.  We  want  line  size  to  be  scaled 
appropriately  to  the  size  of  the  image.  Because  there  is  a  large  range  of  image  sizes,  we 
will  express  line  size  in  terms  of  the  image  height  (H). 

Table 1: Settings Tested for Optimal Model Accuracy

Line Size                   0.5H, 1H, 1.5H, 2H
Number of Columns in PCA    5, 50
Number of PCs               3, 4, 5, 6, 7, 8, 9, 10

Note  that  if  the  algorithm  waits  for  5  columns  to  be  collected  by  the  sensor  in 
order  to  run  PCA,  then  by  the  time  the  analysis  is  started  on  the  first  pixel,  the  sensor  has 
collected  5  or  more  columns  of  data.  Because  the  largest  line  size  we  would  use  is  2H,  we 
could  start  the  analysis  on  the  first  pixel  of  the  second  column  and  not  have  any  issue 
with  our  window  catching  up  with  the  sensor.  This  of  course  is  assuming  that  the  sensor 
is  just  as  fast  as  the  algorithm.  Since  we  are  not  concerned  about  the  algorithm  catching 
up  with  the  sensor,  we  do  not  need  to  run  a  special  simulation  that  slowly  adds  data  to  the 
image  as  the  image  is  being  analyzed.  We  can  simply  run  the  analysis  on  the  entire 
image. 

3.4 HYDICE Hyperspectral Images


This  research  uses  several  images  collected  by  an  airborne  HSI  Sensor  called 
HYDICE  (Hyperspectral  Digital  Imagery  Collection  Equipment).  Specifically,  the 
images  come  from  the  Forest  I  and  Desert  Radiance  II  canonical  datasets  from  the 
HYDICE  program’s  1995  data  collection  experiments  (Eismann,  2011),  (Orloff,  et  al., 
2000). 

There  are  two  sets  of  images:  training  and  test  images.  The  training  images  will 
be  used  to  find  the  optimal  settings  for  the  real-time  LRX.  After  these  optimal  settings  are 
found,  the  real-time  LRX  is  used  on  the  test  images  to  validate  the  settings.  Details  on 
each  image  are  displayed  in  Table  2  for  the  training  images  and  in  Table  3  for  the  test 
images. All of the images have 210 spectral bands and were taken from an altitude of
5,000' AGL (Above Ground Level) except for ARES5D_20kFT, which was taken from 20,000'
AGL. See Appendix A for true color images and truth maps.


Table 2: Details on the HYDICE Training Images

HYDICE Image   Size      Number of Pixels   Target Pixels   Total Targets   Scene Type
ARES1D         291x199   57909              235             6               Desert
ARES1F         191x160   30560              1007            10              Forest
ARES2D         215x104   22360              523             46              Desert
ARES2F         312x152   47424              307             30              Forest
ARES3F         226x136   30736              314             20              Forest
ARES4F         205x80    16400              109             29              Forest

Table 3: Details on the HYDICE Test Images

HYDICE Image   Size      Number of Pixels   Target Pixels   Total Targets   Scene Type
ARES3D         156x156   24336              438             4               Desert
ARES4          460x78    35880              882             15              Desert
ARES5          355x150   53250              585             15              Forest
ARES5D_20kFT   139x68    9450               129             28              Desert

3.5  Summary 

In  this  chapter,  we’ve  covered  the  development  of  our  real-time  anomaly  detector, 
including  how  we  intend  to  speed  up  the  Linear  RX  detector  using  QR  factorization  to 
calculate  the  inverse  covariance  matrix.  We’ve  also  covered  our  base  assumptions,  such 
as  the  direction  the  data  is  collected  and  that  our  data  has  similar  background  data 
throughout  the  image.  Finally,  we  presented  the  settings  we  will  test  and  the  images  we 
will  use.  Next,  in  chapter  4,  we  will  cover  the  analysis  and  results  of  this  research. 

4. Analysis and Results


4.1  Chapter  Overview 

In  this  chapter,  we  will  cover  the  analysis  and  results  for  the  real-time  LRX.  This 
chapter  is  broken  into  four  sections,  which  include  tracking  the  covariance  matrix 
changes,  finding  the  best  settings  for  the  real-time  LRX,  time  savings  of  the  real-time 
LRX  versus  the  original  LRX,  and  the  accuracy  comparison  between  the  real-time  LRX 
and  the  original  LRX. 

4.2  Tracking  the  Covariance  Matrix  Changes 

By tracking the differences in the trace of the window covariance matrix, we anticipated
decreasing the run time of the anomaly detector by eliminating many of the covariance
matrix updates.

We incorporated in the algorithm a check for the change in the trace of the window
covariance matrix each time the window moves to a new test pixel. After looking at the
differences for one image, ARES1F, we noticed the differences were either incredibly
small or quite large. The histogram for the differences is displayed in Table 4. Note
that about half the differences are less than 1.00 x 10^-10 and the other half are
greater than 100. Based on this table, we implemented a rule of thumb: if the trace
difference is greater than 1.00 x 10^8, then the new covariance matrix is used in the
calculation of the score. If the trace difference is less than 1.00 x 10^8, then the
covariance matrix from the previous step can be used. The number of updates required
varies a little by image, but generally less than 10% of the covariance matrices have to
be updated using this threshold.

Table 4: Histogram of the Differences in the Trace for ARES1F

Bin         Frequency
1.00E-10    15468
1.00E-08    0
1.00E-06    0
1.00E-04    0
1.00E-02    0
1.00E+02    0
1.00E+04    11
1.00E+06    825
1.00E+08    13206
1.00E+10    1050
More        0

After  implementing  this  check  on  a  couple  of  images,  we  noticed  an  interesting 
side  effect.  By  mapping  the  pixels  that  were  the  center  or  test  pixel  for  the  windows  that 
had  a  large  change  in  the  covariance  matrix,  we  can  see  the  anomalies.  Note  that  the 
results  do  change  with  different  cutoff  values  and  different  settings.  To  create  Figures  9 
and  10,  we  retained  9  PCs  and  had  a  line  size  of  2H,  where  H  is  the  height  of  the  image. 

In  addition  to  finding  anomalies,  we  see  in  Figure  10  that  the  tree  line  and  the 
roads  are  picked  up  as  well.  This  method  appears  to  pick  up  large  spectral  changes  from 
one  pixel  to  another.  This  would  explain  why  anomalies  as  well  as  tree  lines  and  roads 
are  picked  up. 

Figure 9: ARES 2D Image and Mapping of Pixels with a Large Trace Difference

Figure 10: ARES 3F Image and Mapping of Pixels with a Large Trace Difference

Although this method eliminates about half of the covariance matrix updates, it does not
create any time savings. The steps needed to track the covariance matrix are themselves
so computationally intensive that the entire algorithm took longer to run. Because of the
increased run time, we decided not to include this method in our algorithm.

4.3  Finding  the  Best  Settings 

In order to determine the best settings for the real-time LRX, the detector was used on
each of the 6 training images using all possible combinations of the settings listed in
Table 1. We chose to find the best average true positive fraction (TPF) for a false
positive fraction (FPF) of 0.1. If the FPF were much higher than 0.1, the detector would
not be very useful, since it would be identifying over 10% of the background pixels as
anomalies. We are looking for the best average TPF because we want the best settings for
all of the images, not just one particular image. First, we found the best settings with
the first 5 columns used for PCA for the whole image. The settings with the best average
TPF were found to be 10 PCs and a line size of 1H, with an average TPF of 0.8560. The
same method was then applied using the first 50 columns for PCA, and the same settings
were found to be optimal, with an average TPF of 0.8679. Although the average TPF is
larger for the method using the first 50 columns, the difference is not great, and it
would not be practical to wait for 50 columns to be collected before the image could even
start to be analyzed. For this reason, the optimal settings we will use are 5 columns for
PCA, retaining 10 PCs, and a line size of 1H.
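
The selection criterion, TPF at a fixed FPF of 0.1, can be sketched as below. This is an
illustrative example rather than the evaluation code used in this research: the function
name `tpf_at_fpf`, the thresholding convention (flag scores strictly above the
threshold), and the toy scores are assumptions.

```python
import numpy as np

def tpf_at_fpf(scores, truth, target_fpf=0.1):
    """Return (TPF, FPF) at the lowest threshold whose FPF does not exceed target_fpf."""
    scores = np.asarray(scores, dtype=float).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()
    background = np.sort(scores[~truth])[::-1]          # background scores, descending
    # Number of background pixels that may be flagged (small guard against rounding).
    n_allowed = int(np.floor(target_fpf * background.size + 1e-9))
    if n_allowed >= background.size:
        threshold = -np.inf
    else:
        threshold = background[n_allowed]
    detected = scores > threshold                       # flag strictly above threshold
    return detected[truth].mean(), detected[~truth].mean()

# Toy data: 20 background pixels and 3 target pixels (hypothetical scores).
background_scores = np.arange(20, dtype=float)
target_scores = np.array([25.0, 17.5, 3.0])
scores = np.concatenate([background_scores, target_scores])
truth = np.concatenate([np.zeros(20, dtype=bool), np.ones(3, dtype=bool)])

tpf, fpf = tpf_at_fpf(scores, truth, target_fpf=0.1)
```

Repeating this for each training image under one combination of settings and averaging
the resulting TPF values gives the quantity that is compared across combinations.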

4.4 Time Savings

After  determining  the  best  settings,  we  used  the  real-time  LRX  on  the  4  test 
images  described  in  chapter  3.  The  run  time  was  recorded  for  20  runs  and  an  average  was 
taken  so  we  have  an  average  run  time  per  image.  An  average  run  time  was  collected  for 
each  image  using  LRX  as  well,  using  the  best  settings  described  by  Williams  et  al. 
(Williams,  Bihl,  &  Bauer).  The  results  are  displayed  in  Table  5.  Overall,  the  time  savings 
is  approximately  40%  for  each  image.  Although  it  may  only  be  a  few  seconds  in  each 
case,  such  improvement  can  add  up  over  time. 


Table 5: Average Run Time in Seconds for LRX and LRX with QR

                                  ARES3D   ARES 4   ARES 5   ARES5D_20kFT
LRX           Average             4.97     8.91     12.57    1.91
              Standard Deviation  0.03     0.02     0.03     0.02
LRX with QR   Average             2.65     5.68     7.59     1.02
              Standard Deviation  0.01     0.01     0.01     0.00
Time Savings                      2.32     3.23     4.98     0.89
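
The relative savings behind the "approximately 40%" figure can be recomputed from the
averages in Table 5. The short sketch below does only that arithmetic; the dictionary
names are hypothetical. Per image, the savings come out between roughly 36% and 47% of
the LRX run time.

```python
# Average run times in seconds, copied from Table 5.
lrx = {"ARES3D": 4.97, "ARES 4": 8.91, "ARES 5": 12.57, "ARES5D_20kFT": 1.91}
lrx_qr = {"ARES3D": 2.65, "ARES 4": 5.68, "ARES 5": 7.59, "ARES5D_20kFT": 1.02}

# Time savings as a percentage of the original LRX run time.
percent_savings = {
    name: 100.0 * (lrx[name] - lrx_qr[name]) / lrx[name] for name in lrx
}

for name, pct in percent_savings.items():
    print(f"{name}: {pct:.1f}% faster")
```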

4.5  Accuracy 

Using the best settings, we used the real-time LRX and LRX on the test images and created
ROC curves to compare the accuracy of the methods. Figures 11 through 14 show the ROC
graphs. For all the graphs, the real-time LRX ROC is below the LRX ROC. This suggests
that the accuracy is slightly worse with the real-time LRX. However, the difference is
minimal and probably due to the use of 5 columns for PCA rather than the entire image.
The only ROC with any significant difference is ARES5D_20kFT, and it is likely that the
difference is due to the altitude from which the image was taken. Since the detectors
were trained on images that were taken at 5,000' AGL, it is reasonable to assume there
will be a difference when the detector is used on an image taken at 20,000' AGL.

Figure 11: ROC Comparison of Methods on ARES 3D

Figure 12: ROC Comparison of Methods on ARES 4

Figure 13: ROC Comparison of Methods on ARES 5

Figure 14: ROC Comparison of Methods on ARES 5D 20kFT

4.6 Summary

In this chapter, we saw that tracking the difference in the trace of the covariance
matrix from one window to another does not save time, although it has the nice side
effect of being able to pick out some major spectral differences. After running 32
possibilities on 6 training images, the best average true positive fraction was found for
a false positive fraction of 0.1. The settings for this best average TPF were decided to
be the best settings for the real-time LRX and consisted of using 10 PCs and a line size
of 1H, where H is the height of the image. The same best settings were found using 5
columns for PCA and 50 columns. We chose to go with 5 columns for PCA, since the goal is
to analyze the image as close to real-time as possible. With these best settings, there
was a time savings of approximately 40% between the real-time LRX and the LRX. The
accuracy was a little worse, but not enough to cause much concern.

5. Conclusions and Recommendations


5.1  Chapter  Overview 

This  chapter  lists  the  conclusions  of  the  research  as  well  as  recommendations  for 
future  research. 

5.2  Conclusions  of  Research 

Using  QR  factorization  to  replace  the  inverse  covariance  matrix  with  an  inverse  of 
an  upper  triangular  matrix  greatly  improves  the  speed  of  the  Linear  RX  anomaly  detector. 
With  approximately  40%  time  savings,  the  LRX  detector  with  QR  factorization  could 
potentially  be  used  as  a  real-time  anomaly  detector.  The  accuracy  of  the  real-time  LRX 
detector  was  shown  to  be  about  as  good  as  the  LRX  detector.  Accuracy  of  the  real-time 
LRX  was  only  slightly  worse  than  the  LRX  and  was  probably  due  to  the  assumptions 
made  about  using  Mahalanobis  distance  and  using  5  columns  for  PCA. 

5.3  Recommendations  for  Future  Research 

5.3.1 Proposed Process for Situations that Violate the Assumptions
In  order  to  use  principal  components  analysis  (PCA)  on  the  images  in  real-time, 
we  made  the  assumption  that  the  first  k  columns  of  the  image  are  a  good  indicator  of 
what  to  expect  from  the  rest  of  the  image.  We  used  these  first  k  columns  to  run  PCA  on 
and  apply  the  loadings  across  the  rest  of  the  pixels. 

In the case that this assumption cannot be met, we suggest that the covariance matrix
associated with PCA be tracked and compared to the original. This could be done in one of
two ways. As the sensor collects more data, the covariance matrix could be computed from
the last k columns collected, or it could be computed from the part of the image that has
been collected so far. Whichever covariance matrix is used, it should be compared to the
covariance matrix used in the original PCA. If a significant difference occurs, then the
PCA needs to be redone. Future research could focus on how to track the covariance
matrices and which covariance matrices to track.
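
One way the comparison could look is sketched below for the "last k columns" option.
Everything specific here is an assumption made for illustration and has not been studied
in this research: the function names, the relative Frobenius-norm measure of difference,
and the tolerance value.

```python
import numpy as np

def covariance_of_columns(cube, start, stop):
    """Covariance of the pixels in columns start..stop-1 of an (H, W, B) cube."""
    B = cube.shape[2]
    return np.cov(cube[:, start:stop, :].reshape(-1, B), rowvar=False)

def needs_new_pca(cov_original, cov_recent, tolerance):
    """Flag a PCA refit when the covariance has drifted (hypothetical criterion)."""
    drift = np.linalg.norm(cov_recent - cov_original) / np.linalg.norm(cov_original)
    return bool(drift > tolerance)

rng = np.random.default_rng(4)
H, W, B, k = 100, 20, 6, 5                      # hypothetical dimensions
cube = rng.normal(size=(H, W, B))
cube[:, 10:, :] *= 5.0                          # the background changes halfway across

cov_original = covariance_of_columns(cube, 0, k)        # columns used for the PCA
cov_same = covariance_of_columns(cube, k, 2 * k)        # similar background
cov_changed = covariance_of_columns(cube, W - k, W)     # different background

refit_same = needs_new_pca(cov_original, cov_same, tolerance=1.0)
refit_changed = needs_new_pca(cov_original, cov_changed, tolerance=1.0)
```

How to choose the difference measure and the tolerance is exactly the kind of question
left for future research.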

5.3.2  Trace  of  the  Covariance  Matrix 

As  discussed  in  chapter  4,  the  method  of  tracking  the  covariance  matrix  picked  up 
on  large  spectral  differences  from  pixel  to  pixel.  It  is  possible  to  find  a  way  to  track  the 
covariance  matrix  in  a  more  time  efficient  manner.  This  method  could  then  be 
incorporated  to  produce  even  faster  anomaly  detector  algorithms.  It  may  also  be  possible 
to  use  this  method  as  a  pre-processor  or  a  rudimentary  anomaly  detector,  since  it  picks  up 
large  spectral  differences. 

5.3.3  Step  Changes  to  the  Covariance  Matrix 

There are methods that update the mean and covariance matrix at each step. In our case,
as we move on to a new test pixel, the covariance matrix would be updated by update
equations rather than being re-calculated every single time. One example of this type of
step change is Algorithm AS 41 by Clarke (Clarke, 1971).
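
To illustrate the general idea of step updates, the sketch below maintains a window mean
and covariance by adding the incoming pixel and removing the outgoing one. It uses a
standard Welford-style rank-one update and is not a transcription of Algorithm AS 41; the
class name and the test data are hypothetical.

```python
import numpy as np

class RunningCovariance:
    """Mean and covariance of a sliding window, maintained by rank-one updates."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.scatter = np.zeros((dim, dim))   # sum of outer products of deviations

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean = self.mean + delta / self.n
        self.scatter += np.outer(delta, x - self.mean)

    def remove(self, x):
        if self.n <= 1:                        # removing the last sample resets the state
            self.__init__(self.mean.size)
            return
        delta = x - self.mean
        self.n -= 1
        self.mean = self.mean - delta / self.n
        self.scatter -= np.outer(delta, x - self.mean)

    def covariance(self):
        return self.scatter / (self.n - 1)

rng = np.random.default_rng(5)
data = rng.normal(size=(300, 5))               # hypothetical stream of pixels
width = 50                                     # hypothetical window length

running = RunningCovariance(dim=5)
for row in data[:width]:
    running.add(row)
for i in range(width, len(data)):              # slide the window one pixel at a time
    running.add(data[i])
    running.remove(data[i - width])

cov_recomputed = np.cov(data[-width:], rowvar=False)
```

Each step costs on the order of the square of the number of retained components, rather
than a full recomputation over every pixel in the window.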




Appendix  A 


Figure 15: Color Image for ARES 1D

Figure 16: Color Image for ARES 1F

Figure 17: Color Image for ARES 2D

Figure 18: Color Image for ARES 2F

Figure 19: Color Image for ARES 3D

Figure 20: Color Image for ARES 3F

Figure 21: Color Image for ARES 4

Figure 22: Color Image for ARES 4F

Figure 23: Color Image for ARES 5

Figure 24: Color Image for ARES 5D_20kFT

Using QR Factorization for Real-Time Anomaly Detection in Hyperspectral Images


Appendix B